The Truth About Effects, Capabilities and Effect Subset Constraints

An effect or capability is a set of (compile-time) region names. We use effect to refer to the region names that must be “live” for some expression to type-check and capability to refer to the region names that are “live” at some program point. A effect subset constraint indicates that all region names that appear in one effect qualifier also appear in another effect qualifier. Each program point has a set of “known effect subset relations”.

The intuition is that a program point’s capability and subset relations must imply that an expression’s effect describes live regions, else the expression does not type-check. As we’ll see, default effects for functions were carefully designed so that most Cyclone code runs no risk of such an “effect check” ever failing. But using existential types effectively requires a more complete understanding of the system, though perhaps not as complete as this section presents.

The form of effects or capabilities is described as follows:

The description of regions(t) appears circular, but in fact if we gave the definition for each form of types, it would not be. The point is that regions(a) is an "atomic effect" in the sense that it stands for a set of regions that cannot be further decomposed without knowing a. The primary uses of regions(t) are descibed below.

The form of an effect subset constraint is e1 <= e2 where e1 and e2 are both effects.

We now describe the capability at each program point. On function entry, the capability is the function’s effect (typically the regions of the parameters and result, but fully described below). In a local block or a growable-region statement, the capability is the capability of the enclosing context plus the block/statement’s region name.

The known effect subset relation at a program point are only those that are mentioned in the type of the function within which the program point occurs.

We can now describe an expression’s effect: If it reads or writes to memory described by a region name r, then the effect contains {r}. If it calls a function with effect e, then the effect conatins e. Every function (type) has an effect, but programmers almost never write down an explicit effect. To do so, one puts “; e” at the end of the parameter list, wehre e is an effect. For example, we could write:

`a id(`a x; {}) { return x; }

because the function does not access any memory.

If a function takes parameters of types t1,…,tn and returns a value of type t, the default effect is simply regions(t1)+…+regions(tn)+regions(t). In English, the default assumption is that a function may dereference any pointers it is passed, so the corresponding regions must be live. In our example above, the default effect would have been regions(a). If the caller had instantiateda with int*r, then with the default effect, the type-checker would requirer to be live, but with the explicit effect {} it would not. However, dangling pointers can be created only when using existential types, so the difference is rarely noticed.

By default, a function (type) has no effect subset constraints. That is, the function assumes no relationship between all the effect variables that appear in its type. Adding explicit subset relationships enables more subtyping in the callee and more stringent requirements at the call site (namely that the relationship holds).

We can describe when a capability e and a set of effect subset relations s imply an effect, although your intuition probably suffices. A “normalized effect” is either {} or unions of “atomic effects”, where an atomic effect is either {r} or regions(a). For any effect e1, we can easily compute an equivalent normalized effect e2. Now, e and s imply e1 if they imply every {r} and regions(a) in e2. To imply {r} (or regions(a)), we need {r} (or regions(a)) to be in (a normalized effect of) e) or for b to contain some r <=r2 such that e and b imply `r2. Or something like that.

All of these complications are unnecessary except for existential types, to which we now return. Explicit bounds are usually necessary to make values of existential types usable. To see why, consider the example from the previous section, with the struct declaration changed to remove the explicit bound:

struct T {<`a> 
  `a f1; 
  int f(`a, int); 
};

(So the declaration of t should just have type struct T.) Now the function call f(arg,19) at the end of g will not type-check because the (default) effect for f includes regions(b), which cannot be established at the call site. But with the bound, we know that regions(b) <=`H, which is sufficient to prove the call won’t read through any dangling pointers.